How to become a superhero 
We analyze a collaboration network based on the Marvel Universe comic books. First, we consider the system 
as a binary network, where two characters are connected if they appear in the same publication. The analysis 
of degree correlations reveals that, in contrast to most real social networks, the Marvel Universe presents a 
disassortative mixing on the degree. Then, we use a weight measure to study the system as a weighted network. 
This allows us to find and characterize well defined communities. Through the analysis of the community 
structure and the clustering as a function of the degree we show that the network presents a hierarchical structure. 
Finally, we comment on possible mechanisms responsible for the particular motifs observed. 

PACS numbers: 89.65.-s 89.75.Fb 89.75.Hc 



I. INTRODUCTION 

In recent years the physics community has devoted a
strong effort to the study and analysis of complex networks [1].
These studies allow for general characterizations
such as the small world effect or the scale-free property,
which are shared by many systems, including technological,
biological and social systems. Among
the social networks, the so-called collaboration networks are
of particular interest given the availability of large databases
which allow for extensive statistical analysis, and also since
the connections between the vertices, which represent
individuals, can be precisely defined. Two well known examples of
collaboration networks are the movie actors network, where
two actors are connected if they appear in the same movie,
and scientific collaboration networks, where two
scientists are connected if they are authors of the same
publication.

Perhaps one of the most challenging problems in the char- 
acterization of these systems is the determination of commu- 
nities, which can be vaguely defined as groups of nodes which 
are more connected among themselves than with the rest of 
the network [15]. Through their identification and
analysis one can search for fundamental laws in social
interactions. In this article we will work along
this line, and show that the determination and characterization 
of the communities allow us to detect mechanisms responsible 
for the particular motifs observed. In particular we will focus 
on the Marvel Universe (MU) [22] which is a fictional cos- 
mos created by the Marvel Comics book publishing company. 
The idea of a common Universe allows characters and plots to 
cross over between publications, and also makes continuous 
references to events that happen in other books. In this Uni- 
verse real world events are mixed with science fiction and fan- 
tasy concepts. An interesting question that arises is if this net- 
work, whose nodes correspond to invented entities and whose 
links have been created by a team of writers, resembles in 
some way real-life social networks, or, on the contrary, looks 
like a random network. This issue has been addressed by
Alberich, Miro-Julia and Rossello (AMJR) [23], who used
information from the Marvel Chronology Project database [24]
to build a bipartite collaboration network. They obtained a
network formed by 6486 characters and 12942 books, where
two characters are considered linked if they jointly appear in 
the same comic book. AMJR found that the MU looks almost
like a real social network, since it has most, but not all, of the
characteristics of real collaboration networks such as movie 
actors or scientific collaboration networks. In particular, the 
average degree of the MU is much smaller than the theoreti- 
cal average degree of the corresponding random model, thus 
indicating that Marvel characters collaborate more often with 
the same characters. Also, the clustering coefficient is smaller 
than what is usual in real collaboration networks. Finally, the
degree distribution presents a power law with an exponential
cutoff, $P(k) \sim k^{-\tau} 10^{-k/c}$, with an exponent $\tau = 0.7158$. Since
$\tau$ is much smaller than 2, the average properties of the network
are dominated by the few actors with a large number of
collaborators, indicating that some superheroes such as Captain
America or Spider-Man have many more connections than
would be expected in a real-life collaboration network [23].



II. DEGREE CORRELATIONS 

We begin our study by presenting an analysis of the de- 
gree correlations which fully reveals the artificial nature of 
the MU. Most real social networks are assortatively mixed by
degree, that is, vertices with high degree tend to be connected
to vertices with high degree while vertices with low degree
tend to be connected to vertices with low degree. On
the other hand, most biological and technological networks
present disassortative mixing by degree, where vertices with
high degree tend to be connected to vertices with low degree
[25]. The degree correlations of the network can be analyzed
by plotting the mean degree $\langle k_{nn} \rangle$ of the neighbors of a vertex
as a function of the degree $k$ of that vertex [27]. A positive
slope indicates assortative mixing, while a negative slope
signals disassortative mixing by degree. To begin our
analysis of the MU we consider the system as a binary network,
where two characters are connected if they appear in the same 
publication. In order to compare with their results we use the 
data compiled by AMJR [28]. In Fig. 1 we show in small
circles the different values of $k_{nn}$ obtained in the MU for a
given degree $k$, while the black continuous line shows the
average value $\langle k_{nn} \rangle$. For $k < 10$ the dispersion in the values
of $k_{nn}$ is too large, ranging from a few to more than a
thousand, thus not allowing for any characterization of $\langle k_{nn} \rangle$. As
$k$ increases the dispersion in the values of $k_{nn}$ diminishes in
a funnel-like shape, while $\langle k_{nn} \rangle$ fluctuates around
a constant value, indicating that no correlations dominate up
to $k \approx 200$. For $k > 200$ a decreasing behavior of $\langle k_{nn} \rangle$ can
be clearly observed, and the tail seems to follow a power law
behavior $\langle k_{nn} \rangle \sim k^{-\nu}$ with $\nu \approx 0.52$. This result shows that, in
contrast to what is observed in most real social networks, the
MU network presents a disassortative behavior on the degree.
Surprisingly, the value of the exponent is similar to the one
observed in a real technological network: the Internet, where
an exponent $\nu \approx 0.5$ has also been found [27].

FIG. 1: Mean degree $k_{nn}$ of the neighbors of a vertex as a function
of the degree $k$. Circles show the different values of $k_{nn}$ for a given
$k$. The continuous line indicates the average value $\langle k_{nn} \rangle$.

FIG. 2: Weight distribution $P(w)$ vs. $w$ (circles). The straight line is
a power law fit, $P(w) \sim w^{-2.26}$.
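
As an illustration of the degree-correlation analysis described in this section, the following Python sketch computes the mean degree of the neighbors of every vertex and then averages it over all vertices with the same degree. It is only a minimal sketch: the function name `mean_neighbor_degree` and the toy star graph are assumptions made for illustration and are not taken from the AMJR data.

```python
from collections import defaultdict


def mean_neighbor_degree(edges):
    """Return {k: average k_nn over vertices of degree k} for a binary network.

    `edges` is an iterable of (u, v) pairs; repeated pairs and self-loops
    are ignored, so the network is treated as simple and undirected.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    degree = {v: len(nb) for v, nb in neighbors.items()}

    # k_nn of a single vertex: mean degree of its neighbors.
    knn = {v: sum(degree[n] for n in nb) / len(nb) for v, nb in neighbors.items()}

    # Group the individual k_nn values by the degree of the vertex.
    by_degree = defaultdict(list)
    for v, k in degree.items():
        by_degree[k].append(knn[v])

    return {k: sum(vals) / len(vals) for k, vals in sorted(by_degree.items())}


# Hypothetical toy network: one hub linked to four leaves.
star = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "d")]
result = mean_neighbor_degree(star)
print(result)
```

For the star graph the leaves (degree 1) see a neighbor of degree 4, while the hub (degree 4) sees neighbors of degree 1, so the averaged curve decreases with $k$. This is the signature of disassortative mixing discussed in the text.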

The origin of a disassortative behavior in the Internet is 
most probably given by the fact that the hubs (nodes with 
the largest degree) are connectivity providers, and thus have 
a large number of connections to clients that have only a sin- 
gle connection. In the MU the small exponent in the degree 
distribution observed by AMJR [23] and the disassortative be- 
havior clearly indicate the presence of hubs. One immediately 
wonders what is the role that they play in the MU. Perhaps the 
most intuitive idea to answer this question is to make a list 
that takes into account the degree of the characters or to count 
the number of publications in which they appear. A first step
in this direction was already taken by AMJR, who point out
that Captain America is the superhero with the most connections
and Spider-Man is the one that appears in the largest
number of comic books [23]. Clearly, these classifications help us
to establish how popular these characters are. However, they
do not give information on where to establish a cut-off in the
ranking list. Also, when considering the interactions between
the characters, one is left with the problem of how to deal
with the large number of connections of these hubs. In the
following section we tackle these issues. 



III. THE MARVEL UNIVERSE AS A WEIGHTED 
NETWORK 

In order to advance a step further in the analysis of the 
MU, we take into account the fact that some characters ap- 
pear repeatedly in the same publications. The incorporation 
of this information allows us to distinguish connections
between characters which truly represent a strong social tie, such
as a connection between two characters that form a team, from
those connections that link two characters that perhaps have
met only once in the whole history of the MU. To define the
strength $w_{ij}$ of the ties between characters $i$ and $j$ we use the
weight measure proposed by Newman:

$w_{ij} = \sum_k \frac{\delta_i^k \delta_j^k}{n_k - 1}$    (1)

where $\delta_i^k$ is equal to one if character $i$ appears in book $k$ and
zero otherwise, and $n_k$ is the number of characters in book $k$.
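
The weight measure can be illustrated with a short Python sketch. It assumes the usual form of Newman's collaboration weight, in which every book $k$ shared by characters $i$ and $j$ contributes $1/(n_k - 1)$ to $w_{ij}$. The function name `newman_weights`, the input layout (a dictionary from book identifiers to character lists) and the toy data are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def newman_weights(books):
    """Return {(i, j): w_ij} with i < j, given {book_id: list of characters}.

    Each book with n_k >= 2 distinct characters adds 1 / (n_k - 1) to the
    weight of every pair of characters that appear in it.
    """
    weights = defaultdict(float)
    for characters in books.values():
        cast = sorted(set(characters))  # distinct characters, fixed order
        n_k = len(cast)
        if n_k < 2:
            continue  # a book with a single character creates no tie
        for i, j in combinations(cast, 2):
            weights[(i, j)] += 1.0 / (n_k - 1)
    return dict(weights)


# Hypothetical toy data: two books, three characters.
books = {
    "book1": ["Spider-Man", "Mary Jane"],
    "book2": ["Spider-Man", "Mary Jane", "Hulk"],
}
w = newman_weights(books)
print(w[("Mary Jane", "Spider-Man")])
```

In this toy example the pair that shares both books accumulates 1 + 1/2, while pairs that meet only in the three-character book receive 1/2. Repeated joint appearances in small casts therefore yield the strongest ties.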

In Fig. 2 we present the weight distribution $P(w)$ of the MU
network, which can be fitted by a power law, $P(w) \sim w^{-\gamma}$ with
$\gamma = 2.26$. A power law behavior in weight distributions has
also been observed in real scientific collaboration networks
such as the cond-mat network ($\gamma = 3.7 \pm 0.1$) and the
astro-ph network ($\gamma = 4.0 \pm 0.1$) [29]. However, we must point out
that in the MU the distribution extends over more than two 
decades, while in the real collaboration networks the distribu- 
tions reach one decade only. Also, the value of the exponent 
in the MU is much smaller. As a consequence, a small fraction 
of the interactions are very strong, while the majority interact 
very weakly. Again we find a result that highlights the leading 
role of a few characters. In this case this is reflected in the fact 
that they interact more frequently than other characters do. 

In order to use the information of the weights to find and 
characterize the role of these leading characters we set a 
threshold on the weight and consider only those interactions 
with a value above the threshold. When we set the threshold 
to its highest possible value, in order to leave just the link with 
the largest weight, we find that the connection between
Spider-Man and his girlfriend (later wife) Mary Jane Watson Parker
is the strongest in the MU [30]. When the threshold is
lowered, small groups of nodes form and eventually communities
begin to appear. In Fig. 3(a) we show the Marvel Universe
when the 220 links with the largest $w_{ij}$ are considered [31].
Since there are characters with more than one connection, the
network has only 130 vertices. Four large communities can
be clearly distinguished, while the rest of the characters
appear connected in isolated pairs, or forming very small groups.
In these communities two patterns of interconnections seem to
dominate. On one hand some characters form tightly knit
groups, as the community on top which includes the
character Beast (B) and corresponds to the X-Men. On the other
hand, star-shaped structures dominated by a central
character can also be clearly distinguished. These central characters
are popular characters such as Spider-Man (SM) or Captain
America (CA).

As the threshold in the weight is lowered further, links
between communities appear, and eventually a giant
component emerges, as Fig. 3(b) shows when 300 links have been
added [31]. In order to characterize the growth of the network
as the threshold is lowered, we calculate the fraction of sites in
the largest component, $f_{slc}$. Fig. 4 shows $f_{slc}$ as a function
of the number of links added in decreasing weight order. A
sharp transition can be observed between 200 and 300 links,
where $f_{slc}$ jumps from less than 0.3 to 0.7. After the jump
the giant component presents a slow and almost monotonic
growth, and eventually reaches a saturation value when
approximately 40,000 links are added.
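
The procedure of adding links in decreasing weight order and tracking the largest component can be sketched in Python with a union-find structure. This is a minimal sketch under one stated assumption: the fraction is normalized by the number of vertices that already have at least one link, which is our reading of the text and not a detail it confirms. The function name `largest_component_fraction` and the toy links are hypothetical.

```python
def largest_component_fraction(weighted_links):
    """Return the fraction of sites in the largest component after each link.

    `weighted_links` is an iterable of (u, v, w). Links are added in
    decreasing order of w. The denominator is the number of vertices that
    have entered the network so far (an assumption, see the lead-in).
    """
    links = sorted(weighted_links, key=lambda link: link[2], reverse=True)
    parent, size = {}, {}

    def find(x):
        # Find the root of x, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    largest = 0
    fractions = []
    for u, v, _ in links:
        for x in (u, v):
            if x not in parent:  # first appearance of this vertex
                parent[x] = x
                size[x] = 1
        ru, rv = find(u), find(v)
        if ru != rv:
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru  # merge the smaller component into the larger
            size[ru] += size[rv]
        largest = max(largest, size[find(u)])
        fractions.append(largest / len(parent))
    return fractions


# Hypothetical toy links: three strong pairs, then a weaker bridge.
links = [("A", "B", 5.0), ("C", "D", 4.0), ("E", "F", 3.0), ("B", "C", 2.0)]
f = largest_component_fraction(links)
print(f)
```

In the toy example the fraction drops while isolated pairs enter the network and jumps when the weaker link bridges two pairs. Passing the same links with shuffled weights gives the random-order comparison.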

Notice that only a very small fraction of the links ($\approx 0.001$) is
necessary for the giant component to emerge. This result suggests
a behavior similar to a random network. In fact, Callaway et
al. [32] have shown that if one considers a random network
with a truncated power law degree distribution, such as the
one observed in the MU, then the percolation threshold is also
very small. However, if one does not take the weights into
account and chooses the links in random order, thus erasing
all correlations, a qualitatively different behavior is observed.
Fig. 5 compares the behavior of $f_{slc}$ for the MU and a typical
realization obtained when the links are chosen randomly, as a
function of the number of links added. The inset shows the
behavior of $f_{slc}$ in a log-log plot. Note that when the links are
chosen at random $f_{slc}$ presents regions that decay
following a power law close to $1/x$. This behavior reveals that newly
incorporated links do not form part of the largest component.
They enter connecting isolated pairs of vertices or form part
of a smaller group. When one new link connects a vertex or
a group to the largest component, a jump in $f_{slc}$ is observed.
Eventually a giant component emerges and then a monotonic
growth is observed.

The growing behavior observed in Figs. 3 and 4, where
communities combine to form larger but less cohesive
structures, strongly suggests that the MU network has a hierarchical
structure [33, 34]. Hierarchical networks integrate both
modular and scale-free structure, and can be characterized
quantitatively by the scaling law of the clustering coefficient

$C(k) \sim k^{-1}$    (2)




FIG. 3: a) Network after the addition of 220 links. The initials cor- 
respond to characters that play an important role connecting com- 
munities: Spider-Man (SM), Thing (T), Beast (B), Captain America 
(CA), Namor (N), Hulk (H). b) Network after the addition of 300 
links, when a giant component has emerged. The black (white)
circles indicate characters labeled as heroes (villains). The gray circles
indicate other types of characters, such as people, gods or nodes with
no classification.

where $C$ is the measure proposed by Watts and Strogatz,

$C = \frac{1}{N} \sum_{i=1}^{N} C_i = \frac{1}{N} \sum_{i=1}^{N} \frac{2 n_i}{k_i (k_i - 1)}$    (3)

Here $N$ is the number of vertices, $k_i$ is the degree of vertex $i$,
and $n_i$ is the number of links between the $k_i$ neighbors of $i$. In
Fig. 6 we present the behavior of $C(k)$ as a function of degree $k$
for the MU network. The figure clearly shows the coexistence
of a hierarchy of nodes with different degrees of clustering.
In particular, the nodes with a smaller degree present a higher
clustering than those with a larger degree, and the decay of
$C(k)$ can be bounded by a $1/k$ behavior as the dashed line
shows.
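
The clustering as a function of the degree can be sketched in Python as follows. The sketch assumes the local Watts-Strogatz coefficient $C_i = 2 n_i / (k_i (k_i - 1))$ and averages it over vertices of equal degree. Vertices with $k_i < 2$, for which the coefficient is undefined, are skipped, which is a choice made here and not stated in the text. The function name `clustering_by_degree` and the toy graph are hypothetical.

```python
from collections import defaultdict


def clustering_by_degree(edges):
    """Return {k: mean local clustering of the vertices with degree k}.

    `edges` is an iterable of (u, v) pairs with mutually comparable labels
    (for example strings). Vertices of degree below 2 are skipped.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    by_degree = defaultdict(list)
    for i, nb in neighbors.items():
        k_i = len(nb)
        if k_i < 2:
            continue  # C_i is undefined here; skipping is an assumption
        # n_i: links among the neighbors of i, each counted once (a < b).
        n_i = sum(1 for a in nb for b in neighbors[a] if b in nb and a < b)
        by_degree[k_i].append(2.0 * n_i / (k_i * (k_i - 1)))

    return {k: sum(vals) / len(vals) for k, vals in sorted(by_degree.items())}


# Hypothetical toy graph: a triangle (a, b, c) attached to a small hub h.
toy = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "h"), ("h", "d"), ("h", "e")]
c = clustering_by_degree(toy)
print(c)
```

In the toy graph the degree-2 vertices of the triangle are fully clustered, while the degree-3 vertices average a much lower value, mimicking the decreasing trend of $C(k)$.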

Ravasz and Barabasi [34] note that the presence of a
hierarchical architecture gives a new interpretation to the role of
hubs in complex networks: while the nodes with small
degree are part of densely interlinked clusters, the hubs play the
important role of bringing together the many small
communities of clusters into a single, integrated network. In fact, this
seems to be the case in the MU network where, as we show
in the following paragraphs, the most popular superheroes play
the role of connecting the communities.

FIG. 4: Fraction of sites in the largest component $f_{slc}$ as a function
of the number of links added in decreasing weight order. The inset
shows in detail the transition between 200 and 300 links.

FIG. 5: Fraction of sites in the largest component $f_{slc}$ as a function
of the number of links added when the links are chosen in decreasing
weight order (circles) and also when they are chosen in random order
(squares). The inset presents a detail for short times in a log-log plot.
The dashed line shows the power law decay $1/x$ as a guide to the eye.

The sharp jump that marks the appearance of the giant
component in the MU (see Fig. 4) shows that the characters that
appear repeatedly in the same publications form a well defined
group. As a consequence, a criterion for setting a cut-off can
be well defined in analogy with percolation transitions. If
the threshold is set too high then the giant component breaks
into many small components, such as the star-shaped and the
tightly knit groups. If, on the other hand, the threshold is
set too low, the slow, almost monotonic growth of the
giant component shows that new characters are incorporated
directly to those already present in the giant component. As a
consequence, as more links are added the division into
communities is harder to determine. Thus, we focus our analysis of
the system close to the transition.

FIG. 6: Clustering $C(k)$ as a function of the degree $k$. The dashed
line shows the power law $1/k$ as a guide to the eye.

The inset in Fig. 4 clearly shows that the transition between
200 and 300 links can be subdivided into three smaller jumps.
The events, numbered 1, 2 and 3, correspond to the links
between the following characters:

1. Captain America (CA) - Beast (B)

2. Captain America (CA) - Thing (T) 

3. Hulk (H) - Namor McKenzie (N) 

They are all members of "The Avengers", a team of Earth's 
mightiest heroes, "...formed to fight the foes no single hero 
could withstand" [35]. It is worth stressing that although these
characters form a team, each one clearly belongs to a different
community (see Fig. 3(b)). Note that all the central characters
in each community are linked to other communities. Also, the
central characters tend to be connected among themselves,
forming what is known as a rich-club [36]. In a rich-club the
nodes are rich in the sense that they have a large degree. In
the MU the rich nodes also share another property: they are
all also "heroes". In fact, as Fig. 3(b) shows, if one labels
the characters as heroes or villains [37] one finds that all the
central characters are heroes, while most of the characters that 
surround them are villains. It is also worth stressing that the 
villains in each community are not connected to villains in 
other communities. 

In his work on the network of collaboration among rappers, 
Smith notes that "New rap acts often feature prominent names 
on their most popular singles and first albums in order to help 
attract listeners unfamiliar with them or their style" [38].
Perhaps a similar mechanism is present in the MU, where new
characters are presented next to popular characters so that
they may be noticed. This clearly will increase the number of
connections of the most popular characters, leading to a
rich-get-richer tendency that is reflected in their large number of
connections. Also, since popular characters are also part of
a team, they appear repeatedly together, thus forming a
rich-club. However, there is another ingredient that should also be
taken into account, since, as the characterization of the nodes 
reveals, only heroes team up, while villains do not. 

We believe that the origin of this division is due to the
fact that, although the Marvel Universe incorporates elements
from fantasy and science fiction, the plots of the stories
were restricted by a set of rules established in the Comics
Authority Code of the Comics Magazine Association of America
[39]. In particular, rule number five in part A of the code for
editorial matter states that "Criminals shall not be presented
so as to be rendered glamorous or to occupy a position which 
creates the desire for emulation", and rule number four in part 
B of the same section states that "Inclusion of stories dealing 
with evil shall be used or shall be published only where the 
intent is to illustrate a moral issue and in no case shall evil be 
presented alluringly, nor so as to injure the sensibilities of the 
reader". As a consequence villains are not destined to play 
leading roles. Also, rule number six in part A states that "In
every instance good shall triumph over evil and the criminal 
punished for his misdeeds". We believe that teams of heroes 
are formed as a consequence of this rule. In fact, since the 
heroes will always eventually win, it is necessary for them to 
show at least that some effort is necessary, and thus they need 
to collaborate and cooperate with other superheroes in order 
to finally defeat their enemies. 

IV. CONCLUSIONS 

Summarizing, we analyzed the MU as a collaboration net- 
work. First we defined the system as a binary network, where 
a connection between two characters is either present or
absent. We found that, in contrast to most real social networks, the
MU is a disassortative network, with an exponent very
similar to the one observed in a real technological network, the
Internet. Then, we used a weight measure to analyze the
system as a weighted network. This allowed us to distinguish
interactions between characters that appear repeatedly together
from those interactions between characters that meet only a few
times. We observed that the weight distribution presents a
power law behavior, and thus a small fraction of the
interactions are very strong, while the majority interact very weakly.
By setting a threshold on the weight we were able to show that 
the characters that appear repeatedly in the same publication 
form a well defined group. Through the characterization of 
the community structure and also analyzing the clustering as 
a function of the degree, we showed that the network presents 
a hierarchical structure. We also analyzed the role of the 
hubs, and have shown that these characters form a rich-club 
of heroes that connect different communities. On the other 
hand characters labeled as villains appear around the hubs and 
do not connect communities. We discussed possible
mechanisms that lead to these effects. In particular the rules of the
Comics Authority Code clearly limit the role of villains. Also,
we believe that heroes need to team up in order to show that
some effort is necessary to defeat their enemies, since there
is a rule that states that in the end good shall always triumph
over evil. Finally, we note that a gender classification reveals
that all the central characters are males, and, as in the case of 
villains, the female characters do not play a role connecting 
communities. However, as was already noted, the strongest 
link in the MU is the relation between Spider-Man and Mary
Jane Watson Parker, a fact that shows that although the MU
deals mainly with superheroes and villains, the most popular
plot is a love story.